DOI: 10.1145/3519939.3523711
research-article

WebRobot: web robotic process automation using interactive programming-by-demonstration

Published: 09 June 2022

ABSTRACT

It is imperative to democratize robotic process automation (RPA), as RPA has become a main driver of digital transformation but is still technically very demanding to construct, especially for non-experts. In this paper, we study how to automate an important class of RPA tasks, dubbed web RPA, which are concerned with constructing software bots that automate interactions across data and a web browser. Our main contributions are twofold. First, we develop a formal foundation that supports semantic reasoning about web RPA programs, and we formulate the corresponding synthesis problem in a principled manner. Second, we propose a web RPA program synthesis algorithm based on a new idea called speculative rewriting. This leads to a novel speculate-and-validate methodology in the context of rewrite-based program synthesis, which is shown to be both theoretically simple and practically efficient for synthesizing programs from demonstrations. We have built these ideas into a new interactive synthesizer called WebRobot and evaluated it on 76 web RPA benchmarks. Our results show that WebRobot automated a majority of them effectively. Furthermore, we show that WebRobot compares favorably with a conventional rewrite-based synthesis baseline implemented using egg. Finally, we conduct a small user study demonstrating that WebRobot is also usable.
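
To make the speculate-and-validate idea concrete, the following is a toy Python sketch and not the algorithm from the paper. It assumes a demonstration is a list of `(action, selector)` pairs with XPath-like selectors, speculates a single-action loop by generalizing the one index that differs between the first two actions, and validates the guess by unrolling it against the whole recorded trace. The names `speculate`, `validate`, `unroll`, and `predict_next` are hypothetical and chosen only for this illustration.

```python
import re
from typing import List, Optional, Tuple

Action = Tuple[str, str]     # (action kind, XPath-like selector); an assumed encoding
Loop = Tuple[str, str, int]  # (action kind, selector template containing "{i}", start index)

INDEX = re.compile(r"(\[\d+\])")


def speculate(first: Action, second: Action) -> Optional[Loop]:
    """Guess a loop by generalizing the single index that differs between two actions."""
    if first[0] != second[0]:
        return None
    parts_a = INDEX.split(first[1])
    parts_b = INDEX.split(second[1])
    if len(parts_a) != len(parts_b):
        return None
    diff = [k for k, (a, b) in enumerate(zip(parts_a, parts_b)) if a != b]
    if len(diff) != 1:
        return None
    k = diff[0]
    # The differing piece must be an index such as "[1]" in both selectors.
    if not (INDEX.fullmatch(parts_a[k]) and INDEX.fullmatch(parts_b[k])):
        return None
    start = int(parts_a[k][1:-1])
    if int(parts_b[k][1:-1]) != start + 1:
        return None
    template = "".join(parts_a[:k] + ["[{i}]"] + parts_a[k + 1:])
    return first[0], template, start


def unroll(loop: Loop, n: int) -> List[Action]:
    """Expand the speculated loop into its first n concrete actions."""
    kind, template, start = loop
    return [(kind, template.replace("{i}", str(start + j))) for j in range(n)]


def validate(loop: Loop, trace: List[Action]) -> bool:
    """Accept the loop only if unrolling it reproduces the demonstrated trace exactly."""
    return unroll(loop, len(trace)) == trace


def predict_next(trace: List[Action]) -> Optional[Action]:
    """Speculate from the first two actions, validate on the full trace, then predict."""
    if len(trace) < 2:
        return None
    loop = speculate(trace[0], trace[1])
    if loop is None or not validate(loop, trace):
        return None
    return unroll(loop, len(trace) + 1)[-1]


demo = [("click", "/ul/li[1]/a"), ("click", "/ul/li[2]/a"), ("click", "/ul/li[3]/a")]
print(predict_next(demo))  # ('click', '/ul/li[4]/a')
```

The sketch handles only loops whose body is a single action. It is meant to show the division of labor: speculation is cheap and may be wrong, and validation against the recorded demonstration filters out wrong guesses before anything is suggested to the user.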


      Published in

      PLDI 2022: Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation
      June 2022
      1038 pages
      ISBN: 9781450392655
      DOI: 10.1145/3519939
      General Chair: Ranjit Jhala
      Program Chair: Işil Dillig

      Copyright © 2022 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall acceptance rate: 406 of 2,067 submissions (20%)
